SwinNet: Swin Transformer Drives Edge-Aware RGB-D and RGB-T Salient Object Detection
Abstract
Convolutional neural networks (CNNs) are good at extracting contextual features within limited receptive fields, while transformers can model global long-range dependencies. By combining the advantages of the transformer with the merits of the CNN, Swin Transformer shows strong feature representation ability. Based on it, we propose a cross-modality fusion model, SwinNet, for RGB-D and RGB-T salient object detection. It is driven by Swin Transformer to extract hierarchical features, boosted by an attention mechanism to bridge the gap between the two modalities, and guided by edge information to sharpen the contour of the salient object. To be specific, a two-stream Swin Transformer encoder first extracts multi-modality features, and then a spatial alignment and channel re-calibration module is presented to optimize intra-level cross-modality features. To clarify the fuzzy boundary, an edge-guided decoder achieves inter-level cross-modality fusion under the guidance of edge features. The proposed model outperforms state-of-the-art models on RGB-D and RGB-T datasets, showing that it provides more insight into the cross-modality complementarity task.
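The abstract names a spatial alignment and channel re-calibration step but gives no formulas, so the following is only a rough sketch of what such a fusion step could look like, not the authors' module. It assumes parameter-free sigmoid gates in place of learned attention layers, and the function names (`spatial_align`, `channel_recalibrate`, `fuse`) are hypothetical. Feature maps are plain nested lists shaped channels x height x width.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_align(rgb, aux):
    """Gate both modalities with one shared spatial map (assumed design).

    rgb, aux: lists of C channels, each an H x W nested list of floats.
    """
    channels = len(rgb)
    height, width = len(rgb[0]), len(rgb[0][0])
    # Positions where both modalities respond together get a gate above 0.5.
    attn = [
        [
            sigmoid(sum(rgb[c][i][j] * aux[c][i][j] for c in range(channels)) / channels)
            for j in range(width)
        ]
        for i in range(height)
    ]

    def apply(feat):
        return [
            [[feat[c][i][j] * attn[i][j] for j in range(width)] for i in range(height)]
            for c in range(channels)
        ]

    return apply(rgb), apply(aux)


def channel_recalibrate(feat):
    """Scale each channel by a gate computed from its global average (assumed design)."""
    out = []
    for channel in feat:
        values = [v for row in channel for v in row]
        weight = sigmoid(sum(values) / len(values))
        out.append([[v * weight for v in row] for row in channel])
    return out


def fuse(rgb, aux):
    """Align spatially, re-calibrate channels, then add the two modalities."""
    rgb_a, aux_a = spatial_align(rgb, aux)
    rgb_r, aux_r = channel_recalibrate(rgb_a), channel_recalibrate(aux_a)
    return [
        [[r + a for r, a in zip(r_row, a_row)] for r_row, a_row in zip(r_ch, a_ch)]
        for r_ch, a_ch in zip(rgb_r, aux_r)
    ]


if __name__ == "__main__":
    rgb = [[[1.0, 0.0], [0.0, 1.0]]]  # one channel, 2 x 2
    depth = [[[0.0, 0.0], [0.0, 0.0]]]  # second modality (depth or thermal)
    print(fuse(rgb, depth))
```

In a real model the gates would come from learned convolution or attention layers applied to Swin Transformer features at each encoder level; the sketch only illustrates the order of operations described in the abstract.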
Similar resources
Elastic Edge Boxes for Object Proposal on RGB-D Images
Object proposal is utilized as a fundamental preprocessing of various multimedia applications by detecting the candidate regions of objects in images. In this paper, we propose a novel object proposal method, named elastic edge boxes, integrating window scoring and grouping strategies and utilizing both color and depth cues in RGBD images. We first efficiently generate the initial bounding boxe...
Object proposal on RGB-D images via elastic edge boxes
As a fundamental preprocessing of various multimedia applications, object proposal aims to detect the candidate windows possibly containing arbitrary objects in images with two typical strategies, window scoring and grouping. In this paper, we first analyze the feasibility of improving object proposal performance by integrating window scoring and grouping strategies. Then, we propose a novel ob...
RGB-D Salient Object Detection Based on Discriminative Cross-modal Transfer Learning
In this work, we propose to utilize Convolutional Neural Networks (CNNs) to boost the performance of depth-induced salient object detection by capturing the high-level representative features for depth modality. We formulate the depth-induced saliency detection as a CNN-based cross-modal transfer problem to bridge the gap between the "data-hungry" nature of CNNs and the unavailability of suff...
Local Background Enclosure for RGB-D Salient Object Detection - Supplementary Results
The purpose of this supplementary material is to examine in detail the contributions of our proposed Local Background Enclosure (LBE) feature. A comparison of LBE with the contrast based depth features used in state-of-the-art salient object detection systems is presented. The LBE feature is compared with the raw depth features ACSD [1], DC [3] and a signed version of DC denoted SDC on the RGBD...
Depth-aware CNN for RGB-D Segmentation
Convolutional neural networks (CNN) are limited by the lack of capability to handle geometric information due to the fixed grid kernel structure. The availability of depth data enables progress in RGB-D semantic segmentation with CNNs. State-of-the-art methods either use depth as additional images or process spatial information in 3D volumes or point clouds. These methods suffer from high compu...
Journal
Journal title: IEEE Transactions on Circuits and Systems for Video Technology
Year: 2022
ISSN: 1051-8215, 1558-2205
DOI: https://doi.org/10.1109/tcsvt.2021.3127149